Differences in Time Delay between Search Engine Crawlers at Web Sites
Abstract
Web log mining provides a wealth of information about user traffic and search engine behavior at web sites. Search engine behavior can be used to analyze server load, the quality of search engines, the dynamics of search engine crawlers, the ethics of search engines, and so on. Search engine crawlers are highly automated programs that are seldom regulated manually. These crawlers periodically visit web sites to collect information. The dynamics of a search engine crawler can be characterized by the time delay between two consecutive visits. The more often a crawler visits a web site, the more it contributes to the server load. We examine whether there is a significant difference in the time delay between visits of a single search engine crawler. Similarly, we analyze the time delay between visits of different search engine crawlers to identify differences in their behavior.
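The central quantity in the abstract, the time delay between two consecutive visits of a crawler, can be illustrated with a minimal sketch. It assumes the web log has already been parsed into (crawler name, timestamp) pairs. The function name `inter_visit_delays`, the crawler labels, and the sample timestamps are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from datetime import datetime


def inter_visit_delays(visits):
    """Map each crawler to the delays (in seconds) between its consecutive visits.

    `visits` is an iterable of (crawler_name, timestamp) pairs; this input
    shape is an assumption, not the paper's log format.
    """
    by_crawler = defaultdict(list)
    for crawler, timestamp in visits:
        by_crawler[crawler].append(timestamp)

    delays = {}
    for crawler, stamps in by_crawler.items():
        stamps.sort()  # log entries may not arrive in chronological order
        delays[crawler] = [
            (later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return delays


# Hypothetical sample data, for illustration only.
visits = [
    ("crawler_a", datetime(2024, 1, 1, 0, 0, 0)),
    ("crawler_b", datetime(2024, 1, 1, 0, 30, 0)),
    ("crawler_a", datetime(2024, 1, 1, 1, 0, 0)),
    ("crawler_a", datetime(2024, 1, 1, 3, 0, 0)),
    ("crawler_b", datetime(2024, 1, 1, 6, 30, 0)),
]
delays = inter_visit_delays(visits)
for crawler, values in sorted(delays.items()):
    print(crawler, values)
```

The per-crawler delay lists produced this way are the samples that a significance test would compare, both within one crawler and across crawlers. The abstract does not name the test used, so none is shown here.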
Similar resources
Analysis of the Temporal Behaviour of Search Engine Crawlers at Web Sites
Web log mining is the extraction of web logs to analyze user behaviour at web sites. In addition to user information, web logs provide immense information about search engine traffic and behaviour. Search engine crawlers are highly automated programs that periodically visit the web site to collect information. The behaviour of search engines could be used in analyzing server load, quality of se...
Mining Web Logs to Identify Search Engine Behaviour at Websites
Web Usage Mining, also known as Web Log Mining, is the extraction of user behaviour from web log data. The log files also provide immense information about the search engine traffic at a website. This search engine traffic is helpful in analysing the ethics of search engines, the quality of the crawlers, the periodicity of the visits, and the server load. Search engine crawlers are automated programs w...
Workload-Aware Web Crawling and Server Workload Detection
With the development of search engines, more and more web crawlers are used to gather web pages. The rising crawling traffic has raised the concern that crawlers may impact web sites. On the other hand, a more efficient crawling strategy is required for the coverage and freshness of the search engine index. In this paper, crawlers of several major search engines are analyzed using one six-months acc...
An investigation of web crawler behavior: characterization and metrics
In this paper, we present a characterization study of search-engine crawlers. For the purposes of our work, we use Web-server access logs from five academic sites in three different countries. Based on these logs, we analyze the activity of different crawlers that belong to five search engines: Google, AltaVista, Inktomi, FastSearch and CiteSeer. We compare crawler behavior to the characteristi...
Algorithm for Merging Search Interfaces over Hidden Web
This is the world of information. The size of the World Wide Web [4,5] is growing at an exponential rate day by day. The information on the web is accessed through search engines. These search engines [8] use web crawlers to prepare the repository and update that index at a regular interval. These web crawlers [3, 6] are the heart of search engines. Web crawlers continuously keep on crawling the w...